115 research outputs found
Protein Coding Gene Nucleotide Substitution Pattern in the Apicomplexan Protozoa Cryptosporidium parvum and Cryptosporidium hominis
Cryptosporidium parvum and C. hominis are related protozoan pathogens that infect the intestinal epithelium of humans and other vertebrates. To explore the evolution of these parasites and identify genes under positive selection, we performed a pairwise whole-genome comparison of all orthologous protein coding genes in C. parvum and C. hominis. The ratio of nonsynonymous to synonymous nucleotide substitutions (dN/dS) was calculated genome-wide to detect the impact of positive and purifying selection. Of 2465 pairs of orthologous genes, 27 (1.1%) showed a high ratio of nonsynonymous substitutions, consistent with positive selection. A majority of these genes were annotated as hypothetical proteins. In addition, proteins with transmembrane and signal peptide domains are significantly more frequent in the high dN/dS group.
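The dN/dS screen described in this abstract can be illustrated with a small sketch. It assumes that per-gene dN and dS estimates are already available and uses the conventional cutoff of dN/dS > 1 for candidate positive selection; the abstract does not state the cutoff the authors used, and the gene identifiers and values below are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def dn_ds_ratio(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    if ds <= 0:
        return None
    return dn / ds


def flag_high_ratio(pairs: Dict[str, Tuple[float, float]],
                    cutoff: float = 1.0) -> List[str]:
    """List ortholog pairs whose dN/dS exceeds the cutoff.

    The default cutoff of 1.0 is an assumption (the conventional
    threshold), not a value taken from the abstract.
    """
    flagged = []
    for gene_id, (dn, ds) in pairs.items():
        ratio = dn_ds_ratio(dn, ds)
        if ratio is not None and ratio > cutoff:
            flagged.append(gene_id)
    return sorted(flagged)


# Hypothetical (dN, dS) estimates for three ortholog pairs.
example = {
    "geneA": (0.02, 0.10),  # ratio 0.2, consistent with purifying selection
    "geneB": (0.30, 0.15),  # ratio 2.0, candidate for positive selection
    "geneC": (0.05, 0.0),   # dS is zero, so the ratio is undefined
}
candidates = flag_high_ratio(example)
```

Pairs with dS of zero are skipped rather than treated as infinitely high, which is one reasonable way to avoid spurious candidates among nearly identical genes.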
Matt: Local Flexibility Aids Protein Multiple Structure Alignment
Even when there is agreement on what measure a protein multiple structure alignment should be optimizing, finding the optimal alignment is computationally prohibitive. One approach used by many previous methods is aligned fragment pair chaining, where short structural fragments from all the proteins are aligned against each other optimally, and the final alignment chains these together in geometrically consistent ways. Ye and Godzik have recently suggested that adding geometric flexibility may help better model protein structures in a variety of contexts. We introduce the program Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments: small translations and rotations are temporarily allowed to bring sets of aligned fragments closer, even if they are physically impossible under rigid body transformations. After a dynamic programming assembly guided by these “bent” alignments, geometric consistency is restored in the final step before the alignment is output. Matt is tested against other recent multiple protein structure alignment programs on the popular Homstrad and SABmark benchmark datasets. Matt's global performance is competitive with the other programs on Homstrad, and Matt outperforms them on SABmark, a benchmark of multiple structure alignments of proteins with more distant homology. On both datasets, Matt demonstrates an ability to better align the ends of α-helices and β-strands, an important characteristic of any structure alignment program intended to help construct a structural template library for threading approaches to the inverse protein-folding problem. The related question of whether Matt alignments can be used to distinguish distantly homologous structure pairs from pairs of proteins that are not homologous is also considered.
For this purpose, a p-value score based on the length of the common core and average root mean squared deviation (RMSD) of Matt alignments is shown to largely separate decoys from homologous protein structures in the SABmark benchmark dataset. We postulate that Matt's strong performance comes from its ability to model proteins in different conformational states and, perhaps even more important, its ability to model backbone distortions in more distantly related proteins.
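The RMSD mentioned above as one input to the p-value score can be illustrated as follows. This is a minimal sketch that assumes the two sets of aligned core atoms have already been superposed; it is not Matt's own code, and the coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean squared deviation between paired 3D coordinates.

    Assumes the structures are already superposed and that position i
    in coords_a is aligned with position i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical two-residue common core: one atom matches exactly,
# the other is displaced by 2 units along z.
core_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
core_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
value = rmsd(core_a, core_b)
```

A longer common core at the same RMSD is stronger evidence of structural similarity, which is presumably why the score described above combines both quantities.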
Bioinformatics of Corals: Investigating Heterogeneous Omics Data from Coral Holobionts for Insight into Reef Health and Resilience
Coral reefs are home to over 2 million species and provide habitat for roughly 25% of all marine animals, but they are being severely threatened by pollution and climate change. A large amount of genomic, transcriptomic and other -omics data is becoming increasingly available from different species of reef-building corals, the unicellular dinoflagellates, and the coral microbiome (corals have possibly the most complex microbiome yet discovered, consisting of over 20,000 different species). These new data present an opportunity for bioinformatics researchers and computational biologists to contribute to a timely, compelling, and urgent investigation of critical factors that influence reef health and resilience. This paper summarizes the content of the Bioinformatics of Corals workshop, which is being held as part of PSB 2021. It is particularly relevant for this workshop to occur at PSB, given the abundance of and reliance on coral reefs in Hawaii and the conference’s traditional association with the region.
Going the distance for protein function prediction: a new distance metric for protein interaction networks
Due to an error introduced in the production process, the x-axes in the first panels of Figure 1 and Figure 7 are not formatted correctly. The correct Figure 1 can be viewed here: http://dx.doi.org/10.1371/annotation/343bf260-f6ff-48a2-93b2-3cc79af518a9
In protein-protein interaction (PPI) networks, functional similarity is often inferred based on the function of directly interacting proteins, or more generally, some notion of interaction network proximity among proteins in a local neighborhood. Prior methods typically measure proximity as the shortest-path distance in the network, but this has only a limited ability to capture fine-grained neighborhood distinctions, because most proteins are close to each other, and there are many ties in proximity. We introduce diffusion state distance (DSD), a new metric based on a graph diffusion property, designed to capture finer-grained distinctions in proximity for transfer of functional annotation in PPI networks. We present a tool that, given a PPI network as input, outputs the DSD distances between every pair of proteins. We show that replacing the shortest-path metric with DSD improves the performance of classical function prediction methods across the board.
MC, HZ, NMD and LJC were supported in part by National Institutes of Health (NIH) R01 grant GM080330. JP was supported in part by NIH grant R01 HD058880. This material is based upon work supported by the National Science Foundation under grant numbers CNS-0905565, CNS-1018266, CNS-1012910, and CNS-1117039, and supported by the Army Research Office under grant W911NF-11-1-0227 (to MEC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
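The abstract for the diffusion state distance entry does not give the formula for DSD. The sketch below assumes the commonly described definition: for each protein, take the vector of expected visit counts to every node by a k-step random walk starting at that protein, and define DSD as the L1 distance between two such vectors. The toy network, the function names, and the default walk length are all hypothetical, and this is not the authors' tool.

```python
from typing import Dict, List


def visit_counts(adj: Dict[str, List[str]], start: str, k: int) -> Dict[str, float]:
    """Expected number of visits to each node by a k-step random walk.

    The walk starts at `start` (step 0 counts as a visit) and moves to a
    uniformly chosen neighbor at each step. Assumes every node has at
    least one neighbor.
    """
    prob = {v: 0.0 for v in adj}
    prob[start] = 1.0
    counts = dict(prob)
    for _ in range(k):
        nxt = {v: 0.0 for v in adj}
        for u, p in prob.items():
            if p == 0.0:
                continue
            share = p / len(adj[u])
            for w in adj[u]:
                nxt[w] += share
        prob = nxt
        for v in adj:
            counts[v] += prob[v]
    return counts


def dsd(adj: Dict[str, List[str]], a: str, b: str, k: int = 5) -> float:
    """L1 distance between the visit-count vectors of nodes a and b."""
    ca = visit_counts(adj, a, k)
    cb = visit_counts(adj, b, k)
    return sum(abs(ca[v] - cb[v]) for v in adj)


# Hypothetical three-protein path network: a - b - c.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
d_ab = dsd(toy, "a", "b", k=1)
d_ac = dsd(toy, "a", "c", k=1)
```

Because the distance compares whole visit-count vectors rather than a single path length, two pairs of proteins at the same shortest-path distance can still receive different values, which is the finer-grained distinction the abstract describes.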
BETASCAN: Probable β-amyloids Identified by Pairwise Probabilistic Analysis
Amyloids and prion proteins are clinically and biologically important β-structures, whose supersecondary structures are difficult to determine by standard experimental or computational means. In addition, significant conformational heterogeneity is known or suspected to exist in many amyloid fibrils. Recent work has indicated the utility of pairwise probabilistic statistics in β-structure prediction. We develop here a new strategy for β-structure prediction, emphasizing the determination of β-strands and pairs of β-strands as fundamental units of β-structure. Our program, BETASCAN, calculates likelihood scores for potential β-strands and strand-pairs based on correlations observed in parallel β-sheets. The program then determines the strands and pairs with the greatest local likelihood for all of the sequence's potential β-structures. BETASCAN suggests multiple alternate folding patterns and assigns relative a priori probabilities based solely on amino acid sequence, probability tables, and pre-chosen parameters. The algorithm compares favorably with the results of previous algorithms (BETAPRO, PASTA, SALSA, TANGO, and Zyggregator) in β-structure prediction and amyloid propensity prediction. Accurate prediction is demonstrated for experimentally determined amyloid β-structures, for a set of known β-aggregates, and for the parallel β-strands of β-helices, amyloid-like globular proteins. BETASCAN is able both to detect β-strands with higher sensitivity and to detect the edges of β-strands in a richly β-like sequence. For two proteins (Aβ and Het-s), there exist multiple sets of experimental data implying contradictory structures; BETASCAN is able to detect each competing structure as a potential structure variant. The ability to correlate multiple alternate β-structures to experiment opens the possibility of computational investigation of prion strains and structural heterogeneity of amyloid. BETASCAN is publicly accessible on the Web at http://betascan.csail.mit.edu